Federated learning (FL) aims to minimize the communication complexity of training a model over heterogeneous data distributed across many clients. A common approach is local methods, where clients take multiple optimization steps over local data before communicating with the server (e.g., FedAvg). Local methods can exploit similarity between clients' data. However, in existing analyses, this comes at the cost of slow convergence in terms of the dependence on the number of communication rounds R. On the other hand, global methods, where clients simply return a gradient vector in each round (e.g., SGD), converge faster in terms of R but fail to exploit the similarity between clients even when clients are homogeneous. We propose FedChain, an algorithmic framework that combines the strengths of local methods and global methods to achieve fast convergence in terms of R while leveraging the similarity between clients. Using FedChain, we instantiate algorithms that improve upon previously known rates in the general convex and PL settings, and are near-optimal (via algorithm-independent lower bounds that we show) for problems satisfying strong convexity. Empirical results support the theoretical gains over existing methods.
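Below is a minimal, self-contained Python/NumPy sketch of the chaining idea described above: run a few rounds of a FedAvg-style local method, then switch to SGD-style global rounds. The toy quadratic objectives, client count, step sizes, and round split are illustrative assumptions, not the paper's actual FedChain instantiation or its rates.

```python
# Illustrative sketch only: a toy "chained" federated scheme in the spirit of
# combining a local method (FedAvg-style) with a global method (SGD-style).
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim = 10, 5
# Each client i holds a quadratic f_i(x) = 0.5 * ||x - b_i||^2 (heterogeneous optima).
b = rng.normal(size=(n_clients, dim))
x_star = b.mean(axis=0)  # minimizer of the average objective

def client_grad(i, x):
    return x - b[i]

def local_steps(i, x, steps, lr):
    for _ in range(steps):
        x = x - lr * client_grad(i, x)
    return x

def fedavg_round(x, steps=10, lr=0.1):
    # Each client runs several local gradient steps; the server averages the iterates.
    return np.mean([local_steps(i, x, steps, lr) for i in range(n_clients)], axis=0)

def sgd_round(x, lr=0.1):
    # Each client returns one gradient; the server takes a single averaged step.
    g = np.mean([client_grad(i, x) for i in range(n_clients)], axis=0)
    return x - lr * g

x = np.zeros(dim)
for _ in range(5):           # phase 1: local method exploits client similarity
    x = fedavg_round(x)
for _ in range(20):          # phase 2: global method refines the estimate
    x = sgd_round(x)
print("distance to optimum:", np.linalg.norm(x - x_star))
```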
Unmanned aerial vehicle (UAV)-based remote sensing systems combined with computer vision have the potential to assist building construction and disaster management, such as damage assessment during earthquakes. The vulnerability of a building to earthquakes can be assessed through inspections that take into account the expected damage progression of the associated components and the components' contribution to structural system performance. Most of these inspections are conducted manually, leading to high utilization of manpower, time, and cost. This paper proposes a method to automate these inspections through UAV-based image data collection and a software library for post-processing that helps estimate the seismic structural parameters. The key parameters considered here are the distances between adjacent buildings, building plan shape, building plan area, objects on the rooftop, and roof layout. The accuracy of the proposed method in estimating the above-mentioned parameters is validated through on-site measurements taken using a distance-measuring sensor and with data obtained through Google Earth. Additional details and code can be accessed from https://uvrsabi.github.io/.
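As a rough illustration of two of the parameters listed above, the sketch below estimates building plan area (shoelace formula) and the gap to an adjacent building from rooftop footprint polygons. The footprint coordinates and metres-per-pixel scale are hypothetical, and this is not the library released at the URL above.

```python
# Minimal sketch: estimating building plan area and adjacent-building distance
# from (hypothetical) rooftop footprint polygons given in pixel coordinates.
import numpy as np

def plan_area(footprint, metres_per_pixel=1.0):
    """Shoelace formula on a polygon given as an (N, 2) array of pixel coords."""
    x, y = footprint[:, 0], footprint[:, 1]
    area_px = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    return area_px * metres_per_pixel ** 2

def min_distance(fp_a, fp_b, metres_per_pixel=1.0):
    """Approximate adjacent-building distance as the minimum vertex-to-vertex gap."""
    diffs = fp_a[:, None, :] - fp_b[None, :, :]
    return np.min(np.linalg.norm(diffs, axis=-1)) * metres_per_pixel

building_1 = np.array([[0, 0], [40, 0], [40, 30], [0, 30]], dtype=float)
building_2 = np.array([[55, 5], [80, 5], [80, 25], [55, 25]], dtype=float)
print("plan area (m^2):", plan_area(building_1, metres_per_pixel=0.5))
print("gap to neighbour (m):", min_distance(building_1, building_2, metres_per_pixel=0.5))
```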
Treatment decisions for brain metastatic disease rely on knowledge of the primary organ site, currently determined with biopsy and histology. Here we develop a novel deep learning approach for accurate non-invasive digital histology with whole-brain MRI data. Our IRB-approved single-site retrospective study was comprised of patients (n = 1,399) referred for MRI treatment planning and gamma knife radiosurgery over 19 years. Contrast-enhanced T1-weighted and T2-weighted fluid-attenuated inversion recovery brain MRI exams (n = 1,582) were preprocessed and input to the proposed deep learning workflow for tumor segmentation, modality transfer, and primary site classification into one of five classes (lung, breast, melanoma, renal, and other). Ten-fold cross-validation produced an overall AUC of 0.947 (95% CI: 0.938, 0.955), a lung class AUC of 0.899 (95% CI: 0.884, 0.915), a breast class AUC of 0.990 (95% CI: 0.983, 0.997), a melanoma class AUC of 0.882 (95% CI: 0.858, 0.906), a renal class AUC of 0.870 (95% CI: 0.823, 0.918), and an other class AUC of 0.885 (95% CI: 0.843, 0.949). These data establish that whole-brain imaging features are discriminative enough to allow accurate diagnosis of the primary organ site of malignancy. Our end-to-end deep learning approach shows great promise for classifying metastatic tumor types from whole-brain MRI images. Further refinement may offer an invaluable clinical tool to expedite primary cancer site identification for precision treatment and improved outcomes.
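The evaluation protocol described above (ten-fold cross-validation with overall and per-class one-vs-rest AUCs) can be sketched as follows. The random features, logistic-regression classifier, and sample counts are placeholders, not the paper's imaging features or deep learning workflow.

```python
# Illustrative sketch only: overall and per-class one-vs-rest AUCs under 10-fold CV.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(0)
classes = ["lung", "breast", "melanoma", "renal", "other"]
X = rng.normal(size=(500, 20))                 # stand-in for imaging features
y = rng.integers(0, len(classes), size=500)    # stand-in for primary-site labels

scores = np.zeros((len(y), len(classes)))
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores[test_idx] = clf.predict_proba(X[test_idx])

y_bin = label_binarize(y, classes=range(len(classes)))
print("overall AUC (macro OvR):", roc_auc_score(y, scores, multi_class="ovr"))
for k, name in enumerate(classes):
    print(f"{name} AUC:", roc_auc_score(y_bin[:, k], scores[:, k]))
```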
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain the goals of optimality and computational efficiency, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially-distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiencies, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
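A minimal sketch of the adaptive allocation loop such a design relies on is given below, assuming exponentially distributed rewards with a conjugate Gamma posterior on each arm's rate. The `index` function here is a crude stand-in (posterior mean reward plus an exploration bonus), not the paper's actual Gittins Index computation, which requires its own dynamic-programming calculation.

```python
# Sketch of index-based adaptive allocation with exponential rewards.
# NOT the paper's GI rule: the index is a placeholder for illustration.
import numpy as np

rng = np.random.default_rng(1)
true_means = [1.0, 1.5, 2.0]          # hypothetical mean rewards of three arms
a = np.ones(3)                        # Gamma shape (prior) on each arm's rate
b = np.ones(3)                        # Gamma rate (prior) on each arm's rate

def index(k, t):
    # Posterior mean reward E[1/lambda_k] = b/(a-1) for a > 1, plus an exploration bonus.
    post_mean_reward = b[k] / (a[k] - 1) if a[k] > 1 else b[k]
    return post_mean_reward + np.sqrt(2 * np.log(t + 1) / a[k])

for t in range(300):
    k = int(np.argmax([index(j, t) for j in range(3)]))
    reward = rng.exponential(true_means[k])
    a[k] += 1.0                       # conjugate update: shape += 1 per observation
    b[k] += reward                    #                   rate  += observed reward

print("allocations per arm:", (a - 1).astype(int))
print("posterior mean rewards:", b / (a - 1 + 1e-9))
```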
Modelling and forecasting real-life human behaviour using online social media is an active endeavour of interest in politics, government, academia, and industry. Since its creation in 2006, Twitter has been proposed as a potential laboratory that could be used to gauge and predict social behaviour. During the last decade, the user base of Twitter has been growing and becoming more representative of the general population. Here we analyse this user base in the context of the 2021 Mexican Legislative Election. To do so, we use a dataset of 15 million election-related tweets in the six months preceding election day. We explore different election models that assign political preference to either the ruling parties or the opposition. We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods. These results demonstrate that analysis of public online data can outperform conventional polling methods, and that political analysis and general forecasting would likely benefit from incorporating such data in the immediate future. Moreover, the same Twitter dataset with geographical attributes is positively correlated with results from official census data on population and internet usage in Mexico. These findings suggest that we have reached a period in time when online activity, appropriately curated, can provide an accurate representation of offline behaviour.
Existing federated classification algorithms typically assume the local annotations at every client cover the same set of classes. In this paper, we aim to lift such an assumption and focus on a more general yet practical non-IID setting where every client can work on non-identical and even disjoint sets of classes (i.e., client-exclusive classes), and the clients have a common goal which is to build a global classification model to identify the union of these classes. Such heterogeneity in client class sets poses a new challenge: how to ensure different clients are operating in the same latent space so as to avoid the drift after aggregation? We observe that the classes can be described in natural languages (i.e., class names) and these names are typically safe to share with all parties. Thus, we formulate the classification problem as a matching process between data representations and class representations and break the classification model into a data encoder and a label encoder. We leverage the natural-language class names as the common ground to anchor the class representations in the label encoder. In each iteration, the label encoder updates the class representations and regulates the data representations through matching. We further use the updated class representations at each round to annotate data samples for locally-unaware classes according to similarity and distill knowledge to local models. Extensive experiments on four real-world datasets show that the proposed method can outperform various classical and state-of-the-art federated learning methods designed for learning with non-IID data.
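The matching formulation described above can be sketched as follows: a data encoder embeds samples, a label encoder embeds natural-language class names, and classification is a similarity match between the two. The toy encoders, the character-frequency name featurizer, and all dimensions are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch: classification as matching between data representations
# and class representations produced from natural-language class names.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DataEncoder(nn.Module):
    def __init__(self, in_dim=32, emb_dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, emb_dim))
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

class LabelEncoder(nn.Module):
    def __init__(self, name_dim=26, emb_dim=16):
        super().__init__()
        self.net = nn.Linear(name_dim, emb_dim)
    def forward(self, name_feats):
        return F.normalize(self.net(name_feats), dim=-1)

def name_features(names, dim=26):
    # Toy character-frequency featurizer standing in for a real text encoder.
    feats = torch.zeros(len(names), dim)
    for i, n in enumerate(names):
        for ch in n.lower():
            if ch.isalpha():
                feats[i, ord(ch) - ord("a")] += 1.0
    return feats

class_names = ["cat", "dog", "truck"]            # shared, safe-to-exchange class names
data_enc, label_enc = DataEncoder(), LabelEncoder()
opt = torch.optim.Adam(list(data_enc.parameters()) + list(label_enc.parameters()), lr=1e-3)

x = torch.randn(8, 32)                           # one client's local batch
y = torch.randint(0, 3, (8,))
for _ in range(5):
    class_reps = label_enc(name_features(class_names))   # (num_classes, emb_dim)
    logits = data_enc(x) @ class_reps.t() / 0.1           # similarity matching
    loss = F.cross_entropy(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
print("final matching loss:", float(loss))
```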
This paper addresses smooth function approximation by neural networks (NNs). Mathematical or physical functions can be replaced by NN models through regression. In this study, we obtain NNs that generate highly accurate and highly smooth functions, composed of only a few weight parameters, by examining several aspects of regression. First, we reinterpret the inner workings of NNs for regression; as a consequence, we propose a new activation function, the integrated sigmoid linear unit (ISLU). We then discuss the special characteristics of metadata for regression, which differ from those of other data such as images or sound, with a view to improving the performance of neural networks. Finally, we present a simple hierarchical NN that generates models substituting for mathematical functions, and introduce the new batch concept ``meta-batch", which improves NN performance several times over. The new activation function, the meta-batch method, the features of numerical data, meta-augmentation with metaparameters, and a structure of NN generating a compact multi-layer perceptron (MLP) are central to this study.
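The regression setting can be sketched as below: a small MLP with a smooth activation fitted to a mathematical function. The activation used here (x·sigmoid(x), i.e. SiLU) is only a placeholder marking where a unit such as ISLU would plug in; the paper's actual ISLU definition, meta-batch procedure, and architecture are not reproduced.

```python
# Sketch: smooth-function regression with a small MLP and a smooth activation.
# The activation is a placeholder, not the paper's ISLU.
import torch
import torch.nn as nn

class SmoothAct(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)   # placeholder smooth activation (SiLU)

model = nn.Sequential(
    nn.Linear(1, 32), SmoothAct(),
    nn.Linear(32, 32), SmoothAct(),
    nn.Linear(32, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

x = torch.linspace(-3.0, 3.0, 256).unsqueeze(1)
y = torch.sin(2.0 * x)                # target mathematical function to approximate
for step in range(500):
    loss = torch.mean((model(x) - y) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()
print("final MSE:", float(loss))
```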
The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them.
The Government of Kerala had increased the frequency of supply of free food kits owing to the pandemic; however, these items were static and not indicative of the personal preferences of the consumers. This paper conducts a comparative analysis of various clustering techniques on a scaled-down version of a real-world dataset obtained through a conjoint-analysis-based survey. Clustering carried out by centroid-based methods such as k-means is analyzed, the results are plotted alongside SVD, and a conclusion is reached as to which of the two is better. Once the clusters have been formulated, commodities are decided upon for each cluster. Clustering is further enhanced by reassignment based on a specific cluster-loss threshold. Thus, the most efficacious clustering technique for designing a food kit tailored to the needs of individuals is finally obtained.
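One plausible reading of the pipeline above is sketched below: k-means on synthetic conjoint-style preference scores, followed by a reassignment pass in which clusters whose average loss exceeds a threshold are dissolved and their members moved to the nearest surviving centroid. The data, threshold, and exact reassignment rule are illustrative assumptions, not the paper's code.

```python
# Minimal sketch: k-means clustering with a loss-threshold-based reassignment pass.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))                 # 300 respondents, 6 preference attributes

km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
labels, centers = km.labels_.copy(), km.cluster_centers_

dists = np.linalg.norm(X - centers[labels], axis=1)
avg_loss = np.array([dists[labels == c].mean() for c in range(5)])
threshold = np.median(avg_loss) * 1.2         # hypothetical cluster-loss threshold

keep = np.where(avg_loss <= threshold)[0]     # clusters surviving the threshold
for i in np.where(~np.isin(labels, keep))[0]: # reassign members of dissolved clusters
    labels[i] = keep[np.argmin(np.linalg.norm(centers[keep] - X[i], axis=1))]

for c in keep:                                # one commodity profile per surviving cluster
    print(f"cluster {c}: mean preferences {X[labels == c].mean(axis=0).round(2)}")
```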
Machine learning-based segmentation in medical imaging is widely used in clinical applications from diagnostics to radiotherapy treatment planning. Segmented medical images with ground truth are useful for investigating the properties of different segmentation performance metrics to inform metric selection. Regular geometrical shapes are often used to synthesize segmentation errors and illustrate properties of performance metrics, but they lack the complexity of anatomical variations in real images. In this study, we present a tool to emulate segmentations by adjusting the reference (truth) masks of anatomical objects extracted from real medical images. Our tool is designed to modify the defined truth contours and emulate different types of segmentation errors with a set of user-configurable parameters. We defined the ground truth objects from 230 patient images in the Glioma Image Segmentation for Radiotherapy (GLIS-RT) database. For each object, we used our segmentation synthesis tool to synthesize 10 versions of segmentation (i.e., 10 simulated segmentors or algorithms), where each version has a pre-defined combination of segmentation errors. We then applied 20 performance metrics to evaluate all synthetic segmentations. We demonstrated the properties of these metrics, including their ability to capture specific types of segmentation errors. By analyzing the intrinsic properties of these metrics and categorizing the segmentation errors, we are working toward the goal of developing a decision-tree tool for assisting in the selection of segmentation performance metrics.
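The synthesize-then-score idea can be illustrated with a toy example: perturb a reference mask with a simple shift-type error and evaluate it with two common metrics (Dice and Jaccard). The circular "anatomy", the error model, and the metric choice are assumptions; this is not the emulation tool built on the GLIS-RT data.

```python
# Illustrative sketch: synthesize a segmentation error and score it with two metrics.
import numpy as np

yy, xx = np.mgrid[0:128, 0:128]
truth = (xx - 64) ** 2 + (yy - 64) ** 2 < 30 ** 2      # reference (truth) mask

def shift_error(mask, dx, dy):
    # Emulate a mis-registration-style error by translating the contour.
    return np.roll(np.roll(mask, dy, axis=0), dx, axis=1)

def dice(a, b):
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def jaccard(a, b):
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

for dx in (0, 5, 15):
    seg = shift_error(truth, dx, 0)
    print(f"shift {dx}px  Dice={dice(truth, seg):.3f}  Jaccard={jaccard(truth, seg):.3f}")
```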